AITopics

Neural Information Processing Systems | Feb-16-2026, 17:35:42 GMT

UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond

Kun Zhou

To date, Transformer-based frameworks have demonstrated impressive results in single-image super-resolution (SISR).

artificial intelligence, feature extraction, machine learning, (16 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > France (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Neural Information Processing Systems | Oct-10-2025, 10:00:53 GMT

UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond

Kun Zhou

To date, Transformer-based frameworks have demonstrated impressive results in single-image super-resolution (SISR).

feature extraction, proceedings, projection space, (13 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > France (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Artificial Intelligence | Jul-29-2025

Efficient Proxy Raytracer for Optical Systems using Implicit Neural Representations

Sinaei, Shiva, Zheng, Chuanjun, Akşit, Kaan, Iwai, Daisuke

Ray tracing is a widely used technique for modeling optical systems, but it involves sequential surface-by-surface computations that can be computationally intensive. We propose Ray2Ray, a novel method that leverages implicit neural representations to model optical systems with greater efficiency, eliminating the need for surface-by-surface computations by using a single-pass, end-to-end model. Ray2Ray learns the mapping between rays emitted from a given source and their corresponding rays after passing through a given optical system in a physically accurate manner. We train Ray2Ray on nine off-the-shelf optical systems, achieving positional errors on the order of 1 μm and angular deviations on the order of 0.01 degrees in the estimated output rays. Our work highlights the potential of neural representations as a proxy for optical raytracers.
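The abstract reports accuracy as a positional error and an angular deviation between predicted and reference output rays. A minimal sketch of how such metrics could be computed is given below; the function names, the 3D tuple representation of a ray, and the assumption that positions are stored in millimetres are illustrative choices, not details taken from the paper.

```python
import math


def positional_error_um(p_pred, p_true):
    """Euclidean distance between two 3D points.

    Assumption: inputs are in millimetres; the result is in micrometres.
    """
    return 1000.0 * math.dist(p_pred, p_true)


def angular_deviation_deg(d_pred, d_true):
    """Angle in degrees between two 3D direction vectors.

    Uses atan2(|a x b|, a . b), which stays accurate for very small angles
    and does not require the inputs to be normalised.
    """
    ax, ay, az = d_pred
    bx, by, bz = d_true
    cross = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
    dot = ax * bx + ay * by + az * bz
    return math.degrees(math.atan2(math.hypot(*cross), dot))
```

Averaging these two quantities over a set of test rays would give summary figures comparable in kind to the ones quoted above, assuming the same units.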

artificial intelligence, machine learning, optical system, (12 more...)

doi: 10.1145/3721250.3742994

2507.20513

Country:

Asia > Japan (0.18)
Europe > United Kingdom (0.16)
North America > United States (0.15)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

arXiv.org Artificial Intelligence | Jun-4-2025

Investigating Mask-aware Prototype Learning for Tabular Anomaly Detection

Lu, Ruiying, Liu, Jinhan, Du, Chuan, Guo, Dandan

Tabular anomaly detection, which aims at identifying deviant samples, has been crucial in a variety of real-world applications, such as medical disease identification, financial fraud detection, and intrusion monitoring. Although recent deep learning-based methods have achieved competitive performance, they suffer from representation entanglement and a lack of global correlation modeling, which hinders anomaly detection performance. To tackle the problem, we incorporate mask modeling and prototype learning into tabular anomaly detection. The core idea is to design learnable masks through disentangled representation learning within a projection space and to extract normal dependencies as explicit global prototypes. Specifically, the overall model involves two parts: (i) During encoding, we perform mask modeling in both the data space and projection space with orthogonal basis vectors for learning shared disentangled normal patterns; (ii) During decoding, we decode multiple masked representations in parallel for reconstruction and learn association prototypes to extract normal characteristic correlations. Our proposal derives from a distribution-matching perspective, where both projection space learning and association prototype learning are formulated as optimal transport problems, and the calibration distances are utilized to refine the anomaly scores. Quantitative and qualitative experiments on 20 tabular benchmarks demonstrate the effectiveness and interpretability of our model.

Tabular data, often structured as tables in relational databases with rows signifying individual data samples and columns representing feature variables, have become indispensable across diverse real-world domains including intrusion detection in cybersecurity [1], [2], engineering [3], finance [4], etc. Tabular anomaly detection (AD), which endeavors to identify samples that diverge from a pre-defined notion of normality, plays a pivotal role in diverse scientific and industrial contexts, such as medical disease identification [5], financial fraud detection [6], cybersecurity intrusion monitoring [7], [8], and astronomy [9].

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62306125, the Natural Science Basic Research Plan in Shaanxi Province of China under Grant [2024JC-YBQN-0661], and the Nanning Scientific Research and Technological Development Project (20231042).
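The abstract for this entry frames projection-space learning and prototype learning as optimal transport problems. As a rough illustration only, and not the authors' formulation, the sketch below computes an entropy-regularised transport cost with Sinkhorn iterations between a set of sample embeddings and a set of prototypes. The function names, the squared Euclidean cost, the uniform marginals, `eps`, and the iteration count are all assumptions made for this example.

```python
import math


def squared_distance_costs(samples, prototypes):
    """Cost matrix of squared Euclidean distances (an assumed cost choice)."""
    return [[sum((a - b) ** 2 for a, b in zip(s, p)) for p in prototypes]
            for s in samples]


def sinkhorn(cost, eps=1.0, n_iters=200):
    """Entropy-regularised optimal transport between uniform marginals.

    Returns (transport_cost, plan). Very small eps or very large costs can
    underflow exp(); a log-domain variant would be needed in that regime.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass on samples
    b = [1.0 / m] * m  # uniform mass on prototypes
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan
```

Under these assumptions, embeddings that lie close to the prototypes yield a lower transport cost than embeddings far from them, which is the kind of distance a score calibration step could draw on.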

artificial intelligence, data mining, machine learning, (16 more...)

2506.02757

Country:

Asia > China > Shaanxi Province (0.34)
Asia > China > Guangxi Province > Nanning (0.24)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Law Enforcement & Public Safety (0.88)
Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems | May-27-2025, 09:16:45 GMT

UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond

feature extraction, lightweight single-image super-resolution, unified projection, (5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

arXiv.org Artificial Intelligence | Jan-15-2025

XMusic: Towards a Generalized and Controllable Symbolic Music Generation Framework

Tian, Sida, Zhang, Can, Yuan, Wei, Tan, Wei, Zhu, Wenjie

In recent years, remarkable advancements in artificial intelligence-generated content (AIGC) have been achieved in the fields of image synthesis and text generation, generating content comparable to that produced by humans. However, the quality of AI-generated music has not yet reached this standard, primarily due to the challenge of effectively controlling musical emotions and ensuring high-quality outputs. This paper presents a generalized symbolic music generation framework, XMusic, which supports flexible prompts (i.e., images, videos, texts, tags, and humming) to generate emotionally controllable and high-quality symbolic music. XMusic consists of two core components, XProjector and XComposer. XProjector parses the prompts of various modalities into symbolic music elements (i.e., emotions, genres, rhythms and notes) within the projection space to generate matching music. XComposer contains a Generator and a Selector. The Generator generates emotionally controllable and melodious music based on our innovative symbolic music representation, whereas the Selector identifies high-quality symbolic music by constructing a multi-task learning scheme involving quality assessment, emotion recognition, and genre recognition tasks. In addition, we build XMIDI, a large-scale symbolic music dataset that contains 108,023 MIDI files annotated with precise emotion and genre labels. Objective and subjective evaluations show that XMusic significantly outperforms the current state-of-the-art methods with impressive music quality. Our XMusic has been awarded as one of the nine Highlights of Collectibles at WAIC 2023. The project homepage of XMusic is https://xmusic-project.github.io.

emotion, music, music generation, (15 more...)

2501.08809

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence | Jul-5-2023

Multi-Similarity Contrastive Learning

Mu, Emily, Guttag, John, Makar, Maggie

Given a similarity metric, contrastive methods learn a representation in which examples that are similar are pushed together and examples that are dissimilar are pulled apart. Contrastive learning techniques have been utilized extensively to learn representations for tasks ranging from image classification to caption generation. However, existing contrastive learning approaches can fail to generalize because they do not take into account the possibility of different similarity relations. In this paper, we propose a novel multi-similarity contrastive loss (MSCon), that learns generalizable embeddings by jointly utilizing supervision from multiple metrics of similarity. Our method automatically learns contrastive similarity weightings based on the uncertainty in the corresponding similarity, down-weighting uncertain tasks and leading to better out-of-domain generalization to new tasks. We show empirically that networks trained with MSCon outperform state-of-the-art baselines on in-domain and out-of-domain settings.
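The abstract describes combining supervision from several similarity metrics and down-weighting the uncertain ones. One common way to express that idea is sketched below: a supervised contrastive loss is computed per label set and the losses are combined with learned log-variance weights. The weighting form `exp(-s) * loss + s`, the function names, and the temperature are assumptions for illustration and may differ from the MSCon loss as published.

```python
import math


def supcon_loss(embs, labels, tau=0.5):
    """Supervised contrastive loss for unit-norm embeddings under one label set."""
    n = len(embs)
    total, anchors = 0.0, 0
    for i in range(n):
        sims = [sum(a * b for a, b in zip(embs[i], embs[j])) / tau for j in range(n)]
        log_denom = math.log(sum(math.exp(sims[j]) for j in range(n) if j != i))
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a positive contribute nothing
        total += sum(log_denom - sims[j] for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def multi_similarity_loss(embs, label_sets, log_vars, tau=0.5):
    """Combine per-metric losses with assumed uncertainty weights.

    log_vars[k] plays the role of a learnable log-variance for metric k:
    a larger value shrinks that metric's loss but pays a penalty of log_vars[k].
    """
    return sum(math.exp(-s) * supcon_loss(embs, labels, tau) + s
               for labels, s in zip(label_sets, log_vars))
```

In a training loop the entries of `log_vars` would be optimised together with the encoder, so a similarity metric whose loss stays high ends up with a smaller effective weight.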

artificial intelligence, deep learning, machine learning, (16 more...)

2307.02712

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence | Jun-12-2023

Bag of Image Patch Embedding Behind the Success of Self-Supervised Learning

Chen, Yubei, Bardes, Adrien, Li, Zengyi, LeCun, Yann

Self-supervised learning (SSL) has recently achieved tremendous empirical advancements in learning image representation. However, our understanding of the principle behind learning such a representation is still limited. This work shows that joint-embedding SSL approaches primarily learn a representation of image patches, which reflects their co-occurrence. Such a connection to co-occurrence modeling can be established formally, and it supplements the prevailing invariance perspective. We empirically show that learning a representation for fixed-scale patches and aggregating local patch representations as the image representation achieves similar or even better results than the baseline methods. We denote this process as BagSSL. Even with 32x32 patch representation, BagSSL achieves 62% top-1 linear probing accuracy on ImageNet. On the other hand, with a multi-scale pretrained model, we show that the whole image embedding is approximately the average of local patch embeddings. While the SSL representation is relatively invariant at the global scale, we show that locality is preserved when we zoom into local patch-level representation. Further, we show that patch representation aggregation can improve various SOTA baseline methods by a large margin. The patch representation is considerably easier to understand, and this work makes a step to demystify self-supervised representation learning.
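The aggregation step described in the abstract, where local patch representations are combined into an image representation, can be sketched as follows. This is a toy illustration under stated assumptions: the image is a 2D list of pixel values, `toy_embed` is a hypothetical stand-in for a pretrained patch encoder, and plain averaging is used as the aggregation rule.

```python
def extract_patches(image, size, stride):
    """Fixed-scale square patches from a 2D list, scanned row by row."""
    height, width = len(image), len(image[0])
    return [[row[x:x + size] for row in image[y:y + size]]
            for y in range(0, height - size + 1, stride)
            for x in range(0, width - size + 1, stride)]


def toy_embed(patch):
    """Hypothetical patch encoder: returns (mean, max) of the patch pixels."""
    flat = [p for row in patch for p in row]
    return [sum(flat) / len(flat), max(flat)]


def bag_of_patches_embedding(image, size, stride, embed=toy_embed):
    """Image embedding as the average of its patch embeddings."""
    embs = [embed(p) for p in extract_patches(image, size, stride)]
    dim = len(embs[0])
    return [sum(e[d] for e in embs) / len(embs) for d in range(dim)]
```

Swapping `toy_embed` for a real encoder would leave the aggregation logic unchanged, which is the point of the sketch.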

artificial intelligence, machine learning, natural language, (17 more...)

2206.08954

Country: North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence | May-16-2023

Contrastive Label Enhancement

Wang, Yifei, Zhou, Yiyang, Zhu, Jihua, Liu, Xinyuan, Yan, Wenbiao, Tian, Zhiqiang

Label distribution learning (LDL) is a new machine learning paradigm for solving label ambiguity. Since it is difficult to directly obtain label distributions, many studies are focusing on how to recover label distributions from logical labels, dubbed label enhancement (LE). Existing LE methods estimate label distributions by simply building a mapping relationship between features and label distributions under the supervision of logical labels. They typically overlook the fact that both features and logical labels are descriptions of the instance from different views. Therefore, we propose a novel method called Contrastive Label Enhancement (ConLE), which integrates features and logical labels into a unified projection space to generate high-level features through a contrastive learning strategy. In this approach, features and logical labels belonging to the same sample are pulled closer, while those of different samples are projected farther away from each other in the projection space. Subsequently, we leverage the obtained high-level features to obtain label distributions through a well-designed training strategy that considers the consistency of label attributes. Extensive experiments on LDL benchmark datasets demonstrate the effectiveness and superiority of our method.
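The pull-together, push-apart behaviour described for features and logical labels in a shared projection space can be illustrated with a cross-view InfoNCE loss. The sketch below assumes both views have already been projected into the same space; the function names, cosine similarity, and temperature `tau` are illustrative assumptions rather than the ConLE objective itself.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def cross_view_info_nce(feat_proj, label_proj, tau=0.5):
    """InfoNCE loss treating (feature_i, label_i) as the positive pair.

    Projections of other samples' labels in the batch act as negatives.
    """
    n = len(feat_proj)
    loss = 0.0
    for i in range(n):
        logits = [cosine(feat_proj[i], label_proj[j]) / tau for j in range(n)]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        loss += log_denom - logits[i]
    return loss / n
```

The loss is low when each sample's feature projection is closest to its own label projection and rises when the pairing is mismatched.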

artificial intelligence, label distribution, machine learning, (14 more...)

2305.095

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.05)
Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)